IN THE CLAIMS 

Kindly amend the claims as follows: 

1. (Currently amended) A speaker segmentation method for associating an at least one 
segment of speech for each of at least two sides of a summed audio 
interaction, with one of the at least two sides of the summed audio interaction, using 
additional information, the method comprising: 

a receiving step for receiving the summed audio interaction from 
a capturing and logging unit; 

a segmentation step for associating the at least one segment with one side of 
the at least one summed audio interaction, the segmentation step comprising 

a parameterization step for transforming a speech signal into a set of 
feature vectors and dividing the set into non-overlapping segments; 

an anchoring step for locating an anchor segment for each of the at 
least two sides of the summed audio interaction, the anchoring step 
comprising: 

selecting a homogenous segment as a first anchor segment; 
constructing a first model of the homogenous segment; and 
selecting a second anchor segment such that its model is 
different from the first model; and 
a modeling and classification step for associating at least one second 
segment with each side of the summed audio interaction; and 
a scoring step for assigning a score to said segmentation. 
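
For illustration only, and not as part of the claim language, the parameterization, anchoring, and classification steps recited above can be sketched as follows. Everything concrete here is an assumption made for the sketch: the two-dimensional features (log energy and zero-crossing count), the use of a mean vector as the "model" of a segment, the within-segment spread as the homogeneity measure, and all function names. The claims do not prescribe any of these choices.

```python
import math

def parameterize(signal, frame_len=4, frames_per_segment=2):
    """Map a signal to per-frame feature vectors, then group the vectors
    into non-overlapping segments (assumed features: log energy, zero crossings)."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    features = []
    for f in frames:
        energy = sum(x * x for x in f) / frame_len
        crossings = sum(1 for a, b in zip(f, f[1:]) if (a < 0) != (b < 0))
        features.append((math.log(energy + 1e-9), float(crossings)))
    step = frames_per_segment
    return [features[i:i + step]
            for i in range(0, len(features) - step + 1, step)]

def mean_vector(segment):
    """Assumed stand-in for a statistical model: the mean feature vector."""
    dims = len(segment[0])
    return tuple(sum(v[d] for v in segment) / len(segment) for d in range(dims))

def spread(segment):
    """Average squared distance to the mean; low spread is treated as homogeneous."""
    m = mean_vector(segment)
    return sum(sum((v[d] - m[d]) ** 2 for d in range(len(m)))
               for v in segment) / len(segment)

def select_anchors(segments):
    """First anchor: the most homogeneous segment. Second anchor: the segment
    whose model differs most from the first anchor's model."""
    first = min(range(len(segments)), key=lambda i: spread(segments[i]))
    first_model = mean_vector(segments[first])
    candidates = [i for i in range(len(segments)) if i != first]
    second = max(candidates,
                 key=lambda i: math.dist(mean_vector(segments[i]), first_model))
    return first, second

def classify(segments, first, second):
    """Associate every segment with the side whose anchor model is nearer."""
    models = (mean_vector(segments[first]), mean_vector(segments[second]))
    labels = []
    for s in segments:
        m = mean_vector(s)
        labels.append(0 if math.dist(m, models[0]) <= math.dist(m, models[1]) else 1)
    return labels
```

A synthetic signal that alternates between a quiet, flat "speaker" and a loud, oscillating one exercises the sketch: `segments = parameterize([0.1] * 8 + [1.0, -1.0] * 4 + [0.1] * 8 + [1.0, -1.0] * 4)`, followed by `classify(segments, *select_anchors(segments))`.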

2. (Currently amended) The method of claim 1 wherein the additional information is 
at least one item selected from the group consisting of: computer-telephony-integration 
information related to the summed audio interaction; spotted 
words within the summed audio interaction; data related to the 
summed audio interaction; data related to a speaker thereof; external data 
related to the summed audio interaction; and data related to at least 
one other interaction performed by a speaker of the summed audio 
interaction. 

3. (Original) The method of claim 1 further comprising a model association step for 
scoring the at least one segment against an at least one statistical model of one side, 
and obtaining a model association score. 






10/567,810 



4. (Currently amended) The method of claim 1 wherein the scoring step uses 
discriminative information for discriminating the at least two sides of the summed 
audio interaction. 

5. (Original) The method of claim 4 wherein the scoring step comprises a model 
association step for scoring the at least one segment against an at least one 
statistical model of one side, and obtaining a model association score. 

6. (Original) The method of claim 5 wherein the scoring step further comprises a 
normalization step for normalizing the at least one model score. 
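
For illustration only, and not as part of the claim language, a model association score and its normalization might be computed as sketched below. The sketch assumes that each side's statistical model is a diagonal-covariance Gaussian given as a `(mean, variance)` pair, and that normalization means subtracting the score obtained against the other side's model, which yields a log-likelihood ratio. Both assumptions, and all function names, are hypothetical; the claims do not specify the model family or the normalization.

```python
import math

def gaussian_log_likelihood(vector, mean, var):
    """Log density of one feature vector under a diagonal Gaussian."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
               for x, m, v in zip(vector, mean, var))

def model_association_score(segment, model):
    """Average per-vector log-likelihood of a segment under one side's model."""
    mean, var = model
    return sum(gaussian_log_likelihood(v, mean, var) for v in segment) / len(segment)

def normalized_score(segment, side_model, other_model):
    """Assumed normalization: score relative to the competing side's model."""
    return (model_association_score(segment, side_model)
            - model_association_score(segment, other_model))
```

Under these assumptions a positive normalized score means the segment fits the tested side better than the other side, and the sign flips when the two models are swapped.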

7. (Currently amended) The method of claim 4 wherein the scoring step comprises 
evaluating the association of the at least one segment with a side of the summed 
audio interaction using second additional information. 

8. (Currently amended) The method of claim 7 wherein the second additional 
information is at least one item selected from the group consisting of: 
computer-telephony-integration information related to the at least one summed audio 
interaction; spotted words within the at least one summed audio interaction; data 
related to the summed audio interaction; data related to a speaker 
thereof; external data related to the summed audio interaction; and 
data related to at least one other interaction performed by a speaker of the at least 
one summed audio interaction. 

9. (Original) The method of claim 1 wherein the scoring step comprises statistical 
scoring. 

10. (Original) The method of claim 1 further comprising: 

a step of comparing said score to a threshold; and 

repeating the segmentation step and the scoring step if said score is below 
the threshold. 
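
For illustration only, and not as part of the claim language, the compare-and-repeat behaviour of this claim can be sketched as a loop. The callables `segment_fn` and `score_fn` are hypothetical placeholders for the segmentation step and the scoring step. The `max_attempts` cap and the choice to return the best-scoring attempt are assumptions added so the sketch always terminates; the claim recites neither.

```python
def segment_until_acceptable(segment_fn, score_fn, threshold, max_attempts=5):
    """Repeat segmentation and scoring while the score is below the threshold.

    segment_fn(attempt) returns a segmentation for the given attempt index;
    score_fn(segmentation) returns its score. Returns (segmentation, score)
    for the best attempt seen.
    """
    best = None
    for attempt in range(max_attempts):
        segmentation = segment_fn(attempt)
        score = score_fn(segmentation)
        if best is None or score > best[1]:
            best = (segmentation, score)
        if score >= threshold:
            break  # score is not below the threshold, so stop repeating
    return best
```

A caller might vary the anchor choice or the segment length on each attempt through the `attempt` argument.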

11. (Currently amended) The method of claim 10 wherein the threshold is 
predetermined, or dynamic, or depends on: information associated with said 
summed audio interaction, information associated with an at least one speaker 
thereof, or external information associated with the summed audio interaction. 

12. (Cancelled) 

13. (Currently amended) The method of claim 1 wherein the anchoring step or the 
modeling and classification step comprise using second additional data. 

14. (Currently amended) The method of claim 13 wherein the second additional data is 
at least one item selected from the group consisting of: computer-telephony-integration 
information related to the summed audio interaction; spotted 
words within the summed audio interaction; data related to the 
summed audio interaction; data related to a speaker thereof; external data 
related to the summed audio interaction; and data related to at least 
one other interaction performed by a speaker of the summed audio 
interaction. 

15. (Currently amended) The method of claim 1 further comprising a preprocessing 
step for enhancing the quality of the summed audio interaction. 

16. (Currently amended) The method of claim 1 further comprising a speech/non- 
speech segmentation step for eliminating non-speech segments from the summed 
audio interaction. 
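
For illustration only, and not as part of the claim language, one simple way to eliminate non-speech segments is an energy floor, sketched below. The energy criterion, the `energy_floor` default, and the function name are assumptions; the claim does not state how speech is distinguished from non-speech.

```python
def drop_non_speech(frames, energy_floor=0.01):
    """Keep only frames whose mean squared amplitude reaches the assumed floor."""
    def energy(frame):
        return sum(x * x for x in frame) / len(frame)
    return [f for f in frames if energy(f) >= energy_floor]
```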

17. (Original) The method of claim 1 wherein the segmentation step comprises scoring 
the at least one segment with a voice model of a known speaker. 

18. (Currently amended) A speaker segmentation apparatus for associating an at least 
one segment of speech for each of at least two speakers participating in an 
audio interaction, with a side of the interaction, using additional information, 
the apparatus comprising: 

a segmentation component for associating an at least one segment within 
the audio interaction with one side of the audio interaction, the 
segmentation component comprising: 

a parameterization component for transforming a speech signal 
into a set of feature vectors and dividing the set into non-overlapping 
segments; 

an anchoring component for locating an anchor segment for each 
of the at least two sides of the audio interaction, the anchoring 
component selecting a homogenous segment as a first anchor segment, 
and a second anchor segment having a statistical model different from a 
statistical model associated with the first anchor segment; and 

a modeling and classification component for associating at least 
one second segment with each side of the audio interaction; and 
a scoring component for assigning a score to said segmentation. 

19. (Currently amended) The apparatus of claim 18 wherein the additional information 
is at least one item selected from the group consisting of: computer-telephony-integration 
information related to the audio interaction; spotted words 
within the audio interaction; data related to the audio 
interaction; data related to a speaker thereof; external data related to the 
audio interaction; and data related to at least one other interaction performed by a 
speaker of the audio interaction. 

20. (Currently amended) A quality management apparatus for interaction-rich speech 
environments, the apparatus comprising: 

a capturing or logging component for capturing or logging an at least one 
audio interaction in which at least two sides communicate; 

a segmentation component for segmenting the at least one audio interaction, 
the segmentation component comprising: 

a parameterization component for transforming a speech signal 
into a set of feature vectors and dividing the set into non-overlapping 
segments; 

an anchoring component for locating an anchor segment for each 
of the at least two sides of the at least one audio interaction, the 
anchoring component selecting a homogenous segment as a first anchor 
segment, and a second anchor segment having a statistical model 
different from a statistical model associated with the first anchor 
segment; and 

a modeling and classification component for associating at least 
one second segment with each side of the at least one audio interaction; 
and 

a playback component for playing an at least one part of the at least one 
audio interaction. 

21. (Cancelled) 

22. (Previously presented) The method of claim 1 wherein the homogenous segment is 
selected by spotting a predetermined phrase. 

23. (Cancelled) 